Straight out of undergrad, I was ready for some outdoor science. I started off with a short semi-volunteer field position in the early fall, and then headed home to the parents to apply for more.
I applied and applied… Not. A. Peep!
Once January rolled around, job offers finally started coming in. It turned out I had been applying in the wrong season. Had I done some data sleuthing, I would have realized that winter is the slow season for field jobs. Now that I’m armed with those very skills, I’m curious to explore when different types of ecology positions are posted on ecolog.
Let’s start by getting some data…
library(tidyverse)
library(lubridate)
library(knitr)
data <- read.csv("../data/ecolog_all.csv",
                 stringsAsFactors = FALSE)
Rather than doing something fancy to sort these posts, I used str_detect to check whether any of the terms that I associate with faculty/graduate positions or jobs appears in the subject line. Here, I’ve written this into a function:
is_position <- function(subject){
  # TRUE when the lower-cased subject contains any of the terms below.
  # These are plain substring matches, so "intern" also hits "international"
  # and the bare "ms" hits words such as "systems".
  subject %>%
    str_to_lower() %>%
    str_detect("job|hiring|employment|position|research associate|scientist|ecologist|researcher|technician|lead|assistant|fellow|educator|specialist|manager|biologist|coordinator|director|reu|undergrad| ms |msc|m\\.sc|master|graduate|ms|phd|ph\\.d|doctoral|post doc|postdoc|postdoctoral|professor|faculty|tenure|lecturer|instructor|temporary|intern|volunteer")
}
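Because str_detect looks for these terms anywhere in the subject, it’s worth seeing what that means in practice. Here is a small sketch using a shortened stand-in for the pattern and three subject lines I made up for illustration (they are not from the ecolog data). The word-boundary variant is just one possible way to tighten the match, not what the analysis in this post uses.

```r
library(stringr)

# Made-up subject lines for illustration only (not from the ecolog data)
subjects <- c("Field Technician position, Montana",
              "International symposium on wetlands",
              "New paper on stream systems")
subjects_low <- str_to_lower(subjects)

# Shortened stand-in for the full pattern in is_position()
loose <- "position|technician|intern|ms"
hits_loose <- str_detect(subjects_low, loose)

# Hypothetical stricter variant: \\b requires a word boundary on both sides
strict <- "\\b(position|technician|intern|ms)\\b"
hits_strict <- str_detect(subjects_low, strict)

data.frame(subject = subjects, loose = hits_loose, strict = hits_strict)
```

The loose pattern flags all three lines, since "international" contains "intern" and "systems" contains "ms"; the stricter one keeps only the first. The trade-off is that word boundaries would also skip plurals and longer forms like "positions" or "internship", so for a quick-and-dirty sort the loose version may well be the better choice.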
Now let’s apply it to the data and also categorize the posts into more specific groups, as well:
data_cat <- data %>%
  as_tibble() %>%
  # drop 2018 because we only have partial data for that year
  filter(is_position(subject),
         year(date) != 2018) %>%
  mutate(month = month(date), # for counting purposes later
         subject_low = str_to_lower(subject), # str_detect is case sensitive
         is_fieldwork = str_detect(subject_low, "field|seasonal"),
         is_gradschool = str_detect(subject_low, " ms |msc|master|phd|ph\\.d|doctoral|graduate"),
         is_faculty = str_detect(subject_low, "faculty|tenure|professor")) %>%
  # "other" = matched none of the three categories above (vectorized, so no rowwise() needed)
  mutate(is_other = !(is_fieldwork | is_gradschool | is_faculty))
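To see how these flags behave, here is a minimal sketch on four subject lines I invented (not real ecolog posts), with the patterns abbreviated for brevity. It also shows that the categories are not mutually exclusive.

```r
library(dplyr)
library(stringr)

# Invented subject lines for illustration only
toy <- tibble(subject = c("Seasonal field technician",
                          "PhD position with field component",
                          "Tenure-track faculty position",
                          "Data manager job"))

toy_cat <- toy %>%
  mutate(subject_low = str_to_lower(subject),
         # abbreviated versions of the patterns used on the real data
         is_fieldwork = str_detect(subject_low, "field|seasonal"),
         is_gradschool = str_detect(subject_low, "phd|master|graduate"),
         is_faculty = str_detect(subject_low, "faculty|tenure|professor"),
         # TRUE only when none of the three flags is set
         is_other = !(is_fieldwork | is_gradschool | is_faculty))

toy_cat %>%
  select(subject, is_fieldwork, is_gradschool, is_faculty, is_other)
```

The second line gets flagged as both fieldwork and grad school, so the category counts can add up to more than the total number of posts. Only the last line ends up in is_other.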
First, a look at the total number of posts per month. I imagine there won’t be any overall seasonal signal, since the position types are so varied.
data_cat %>%
  group_by(month) %>%
  summarize(count = n()) %>%
  ggplot() +
  geom_col(aes(x = as.factor(month), y = count)) +
  xlab("Month") + ylab("Number of posts") +
  theme_bw()
Turns out I’m wrong: there’s a dip in the summer! Let’s look at how that breaks down to investigate further.